DETAILED ACTION

1. This Office Action is in response to the amendment filed December 19, 2008. With the
amendment filed December 19, 2008, applicant has amended independent claims 1, 11, and 21
and added new claims 29-32. Currently, claims 1, 4-5, 7-11, 14-15, 17-23, and 29-32 are 
pending. 

Claim Rejections - 35 USC § 103 

The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all
obviousness rejections set forth in this Office action:

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in 
section 102 of this title, if the differences between the subject matter sought to be patented and the prior art are 
such that the subject matter as a whole would have been obvious at the time the invention was made to a person 
having ordinary skill in the art to which said subject matter pertains. Patentability shall not be negatived by the 
manner in which the invention was made. 

2. Claims 1, 4-5, 7-9, 10-11, 14-15, and 17-23 are rejected under 35 U.S.C. 103(a) as being 
unpatentable over Lumelsky in view of Applicant's Admitted Prior Art (AAPA). 

3. Lumelsky discloses a text-to-speech and prosody based authoring system, which includes 
a speech analyzer responsive to a spoken utterance. The speech analyzer generates a speech 
signal representative of one or more prosodic parameters associated with a speaker. A
text-to-speech converter, responsive to a text signal, generates a phonetic representation
signal from the text signal and synthesizes a speech signal from the text signal.

4. Regarding claims 1, 4-5, 11, 14-15, and 21, Lumelsky discloses a system and program
storage device readable by a machine, tangibly embodying a program of instructions executable
by the machine to perform method steps for speech synthesis, the method steps comprising:
determining prosodic parameters of a spoken utterance; automatically generating a marked-up
text corresponding to the spoken utterance using the prosodic parameters; and generating a
synthetic waveform using the marked-up text (col. 8, line 52 continuing to col. 17, line 16).
Lumelsky discloses that the user specifies the pronunciation of the text string (col. 10, line
49 to col. 12, line 25: the system synthesizes the voice using one or more recorded allophone
dictionaries, which may be individually selected by the user. Because several dictionaries are
available, the allophones recorded in the dictionaries define the preferred narrator voices, one of
which may be chosen by the user, such that the user may preselect the type of "voice" he wishes
to have narrate the requested information and, depending on the selection, the appropriate
allophone dictionary is used to speech-synthesize the information. Additionally, Lumelsky
indicates the system operates to generate prosody parameters, based on individual speech, and
then use them during the speech synthesis at the user terminal. The prosody parameters are
obtained by processing the speech signal submitted by the narrator). Lumelsky discloses the
instructions for determining prosodic parameters comprise instructions for determining pitch 
contour, duration contour or energy contour information of the spoken utterance, or any 
combination thereof (col. 8, line 52 continuing to col. 17, line 16). Lumelsky does not 
specifically teach an alignment process for aligning the spoken utterance with a corresponding 
text string. However, aligning a spoken utterance with a corresponding text string was well 
known in the art. Applicant's admitted prior art (AAPA) specifically indicates implementation 
of Viterbi alignment was well known in the art. It would have been obvious to one of ordinary 
skill at the time of the invention to implement alignment processing in the system of Lumelsky, 
for the purpose of providing improved marked text with appropriate prosodic information so as 
to generate more natural and realistic synthetic speech. 
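
For illustration only, the kind of alignment referred to above can be sketched as a small
dynamic-programming routine. This sketch is not taken from Lumelsky or from AAPA; the function
name viterbi_align and the per-frame score table log_likes are hypothetical, and the scores are
assumed to come from some acoustic model that is not shown. Each audio frame is assigned to one
text unit, units are visited in order, and the path with the highest total score is kept.

```python
from typing import List, Sequence


def viterbi_align(log_likes: Sequence[Sequence[float]]) -> List[int]:
    """Return, for each frame, the index of the text unit it is aligned to.

    log_likes[t][s] is an assumed log-likelihood of frame t under unit s.
    The path starts in unit 0, ends in the last unit, and at each frame
    either stays in the current unit or advances to the next one.
    """
    num_frames = len(log_likes)
    num_units = len(log_likes[0])
    if num_frames < num_units:
        raise ValueError("need at least one frame per text unit")

    neg_inf = float("-inf")
    best = [[neg_inf] * num_units for _ in range(num_frames)]
    back = [[0] * num_units for _ in range(num_frames)]
    best[0][0] = log_likes[0][0]

    for t in range(1, num_frames):
        for s in range(num_units):
            stay = best[t - 1][s]
            advance = best[t - 1][s - 1] if s > 0 else neg_inf
            if advance > stay:
                best[t][s] = advance + log_likes[t][s]
                back[t][s] = s - 1
            else:
                best[t][s] = stay + log_likes[t][s]
                back[t][s] = s

    # Trace back from the last unit at the last frame.
    path = [num_units - 1]
    for t in range(num_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path
```

Given the resulting frame-to-unit assignment, prosodic measurements taken on the frames (for
example pitch or energy) could be attached to the text units they were aligned with.
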

Regarding claims 7 and 17, Lumelsky discloses that the instructions and methods for
automatically generating a marked-up text comprise instructions and methods for directly
specifying the prosodic parameters as attribute values for mark-up elements (col. 8, line 52
continuing to col. 17, line 16).

Regarding claims 8 and 18, Lumelsky discloses that the instructions and methods for
automatically generating a marked-up text comprise instructions and methods for assigning
abstract labels to the prosodic parameters to generate a high-level mark-up (col. 8, line 52
continuing to col. 17, line 16).

Regarding claims 10, 20 and 23, Lumelsky discloses processing phonetic content of the 
spoken utterance to generate the synthetic waveform having a desired pronunciation (col. 8, line 
52 continuing to col. 17, line 16). 

Regarding claims 9 and 19, Lumelsky does not specifically teach the marked-up text is 
generated using SSML (speech synthesis markup language). AAPA specifically indicates 
implementation of SSML for use on the Internet was known. It would have been obvious to one 
of ordinary skill at the time of the invention to implement SSML in the system of Lumelsky, for 
the purpose of providing high quality synthetic speech for use with Internet applications and 
resources. 
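
For illustration only, the two mark-up styles discussed for claims 7-9 (prosodic parameters
written directly as attribute values, and abstract labels standing in for them) can be sketched
with the SSML prosody element. This sketch is not taken from Lumelsky or from AAPA; the function
names, the segment tuple layout, and the pitch thresholds are hypothetical, and the output is a
minimal fragment that is not claimed to be schema-valid SSML.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, Tuple

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def pitch_label(mean_hz: float) -> str:
    """Map a mean pitch to an SSML pitch label (thresholds are assumptions)."""
    if mean_hz < 100:
        return "low"
    if mean_hz < 200:
        return "medium"
    return "high"


def to_ssml(segments: Iterable[Tuple[str, float, int]], abstract: bool = False) -> str:
    """Build an SSML string from (text, mean pitch in Hz, duration in ms) tuples.

    abstract=False writes measured values directly as attribute values;
    abstract=True writes a high-level label instead.
    """
    speak = ET.Element("speak", {"version": "1.0", "xmlns": SSML_NS})
    for text, mean_hz, duration_ms in segments:
        if abstract:
            attrs = {"pitch": pitch_label(mean_hz)}
        else:
            attrs = {"pitch": f"{mean_hz:.0f}Hz", "duration": f"{duration_ms}ms"}
        ET.SubElement(speak, "prosody", attrs).text = text
    return ET.tostring(speak, encoding="unicode")
```

Calling to_ssml([("hello", 180.0, 400)]) would yield a prosody element carrying the measured
pitch and duration, while passing abstract=True would yield the same text labeled with a
high-level pitch category.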

Regarding claim 22, Lumelsky discloses a user interface that enables a user to input the 
spoken utterance and input a text string corresponding to the spoken utterance (Figure 1, element 
101). 

Claims 29-32 are rejected under 35 U.S.C. 103(a) as being unpatentable over Lumelsky
in view of Applicant's Admitted Prior Art (AAPA), as applied to claims 1 and 11 above,
and further in view of Saon et al., "Maximum Likelihood Discriminant Feature Spaces," 2000,
IEEE International Conference on Acoustics, Speech, and Signal Processing, Volume 2, 5-9 June
2000, pages 1129-1132.

5. Lumelsky discloses a text-to-speech and prosody based authoring system, which includes 
a speech analyzer responsive to a spoken utterance. The speech analyzer generates a speech 
signal representative of one or more prosodic parameters associated with a speaker. A
text-to-speech converter, responsive to a text signal, generates a phonetic representation
signal from the text signal and synthesizes a speech signal from the text signal.

6. Regarding claims 29-32, the combination of Lumelsky and AAPA teaches everything as
claimed in claims 1 and 11. Lumelsky does not disclose all the details for extracting acoustic
information from the audio signal, including transforming the digitized input waveforms into a
set of feature vectors on a frame-by-frame basis by producing a multi-dimensional cepstra
feature vector for predetermined intervals of the spoken audio signal, concatenating frames to
the left and to the right of a current frame to augment a current cepstral vector, and reducing the
dimension of each augmented cepstral vector using linear discriminant analysis. However,
extracting cepstra features, splicing the frames, and using linear discriminant analysis for
dimensionality reduction were well known in speech and signal processing so as to obtain the best
quality features with minimal loss in discrimination when the vectors' dimensionality is
reduced. Saon discloses a speech processing application that extracts acoustic information
from voicemail messages and processes the audio signals to produce feature vectors of
cepstral, delta and delta-delta coefficients from 9 consecutive frames, where the 9 consecutive
24-dimensional vectors were spliced together to form 216-dimensional feature vectors, which are
subsequently reduced by applying the LDA. It would have been obvious to one of ordinary skill
at the time of the invention to modify the system of Lumelsky to implement producing cepstra
feature vectors, as was well known in the art, for the purpose of generating quality coefficients to
be used in the system processing so as to ensure the best quality speech is synthesized.
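
For illustration only, the splicing arithmetic described above (9 consecutive 24-dimensional
vectors forming one 216-dimensional vector, followed by a linear reduction) can be sketched as
follows. This sketch is not taken from Saon or Lumelsky. The function names are hypothetical,
repeating the first and last frames at the utterance edges is an assumption, and the projection
matrix is a stand-in that simply keeps the leading coordinates; a real system would use a matrix
estimated by linear discriminant analysis, which is not shown here. The output size of 40 used
in the note below is likewise arbitrary.

```python
from typing import List, Sequence

Vector = List[float]


def splice_frames(frames: Sequence[Vector], context: int = 4) -> List[Vector]:
    """Concatenate each frame with `context` frames to its left and right.

    With context=4 and 24-dimensional frames, each output vector has
    9 * 24 = 216 dimensions. Edge frames are padded by repeating the
    first or last frame (an assumption made for this sketch).
    """
    last = len(frames) - 1
    spliced: List[Vector] = []
    for t in range(len(frames)):
        window: Vector = []
        for k in range(t - context, t + context + 1):
            window.extend(frames[min(max(k, 0), last)])
        spliced.append(window)
    return spliced


def placeholder_projection(in_dim: int, out_dim: int) -> List[Vector]:
    """Stand-in for a trained LDA matrix: keeps the first out_dim coordinates."""
    return [[1.0 if j == i else 0.0 for j in range(in_dim)] for i in range(out_dim)]


def project(vectors: Sequence[Vector], matrix: Sequence[Vector]) -> List[Vector]:
    """Apply a linear projection; each matrix row yields one output dimension."""
    return [[sum(w * x for w, x in zip(row, v)) for row in matrix] for v in vectors]
```

Applying splice_frames to a sequence of 24-dimensional frames and then project with a 40-row
matrix would turn each frame into a 216-dimensional vector and then into a 40-dimensional one.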

Response to Arguments 

7. Applicant's arguments with respect to claims 1, 11, and 21 have been considered but are
moot in view of the new ground(s) of rejection.

Conclusion 

8. Applicant's amendment necessitated the new ground(s) of rejection presented in this 
Office action. Accordingly, THIS ACTION IS MADE FINAL. See MPEP § 706.07(a). 
Applicant is reminded of the extension of time policy as set forth in 37 CFR 1.136(a). 

A shortened statutory period for reply to this final action is set to expire THREE 
MONTHS from the mailing date of this action. In the event a first reply is filed within TWO 
MONTHS of the mailing date of this final action and the advisory action is not mailed until after 
the end of the THREE-MONTH shortened statutory period, then the shortened statutory period 
will expire on the date the advisory action is mailed, and any extension fee pursuant to 37
CFR 1.136(a) will be calculated from the mailing date of the advisory action. In no event,
however, will the statutory period for reply expire later than SIX MONTHS from the date of this
final action.

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to ANGELA A. ARMSTRONG whose telephone number is 
(571)272-7598. The examiner can normally be reached on Monday-Thursday 11:30-8:00 PM.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Patrick N. Edouard, can be reached on 571-272-7603. The fax phone number for the
organization where this application or proceeding is assigned is 571-273-8300. 

Information regarding the status of an application may be obtained from the Patent 
Application Information Retrieval (PAIR) system. Status information for published applications 
may be obtained from either Private PAIR or Public PAIR. Status information for unpublished 
applications is available through Private PAIR only. For more information about the PAIR 
system, see http://pair-direct.uspto.gov. Should you have questions on access to the Private PAIR 
system, contact the Electronic Business Center (EBC) at 866-217-9197 (toll-free). If you would 
like assistance from a USPTO Customer Service Representative or access to the automated 
information system, call 800-786-9199 (IN USA OR CANADA) or 571-272-1000. 

/Angela A Armstrong/ 

Primary Examiner, Art Unit 2626 



